fix: CTE queries with non-SELECT statements #25014

Merged
merged 5 commits into from
Aug 19, 2023

Member

@dpgaspar dpgaspar commented Aug 17, 2023

SUMMARY

Improves the SQL Lab parse check so that CTEs containing non-SELECT statements are no longer treated as SELECT queries.

The following statement was being accepted as a valid SELECT statement:

WITH a AS ( INSERT INTO foo (id) VALUES (1) RETURNING id ) SELECT * FROM a;
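To illustrate why a statement like this can slip through, here is a minimal sketch of a keyword-based check. It is not Superset's actual implementation: `naive_is_select` and `stricter_is_select` are hypothetical helpers, and the keyword list is an assumption made for the example.

```python
import re

SQL = "WITH a AS ( INSERT INTO foo (id) VALUES (1) RETURNING id ) SELECT * FROM a;"

# Assumed, non-exhaustive list of data-modifying keywords.
DML = re.compile(r"\b(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def naive_is_select(sql: str) -> bool:
    """Hypothetical check: anything starting with SELECT or WITH is read-only."""
    words = sql.split(None, 1)
    if not words:
        return False
    return words[0].upper() in {"SELECT", "WITH"}


def stricter_is_select(sql: str) -> bool:
    """Additionally reject statements that mention a DML keyword anywhere."""
    return naive_is_select(sql) and DML.search(sql) is None


print(naive_is_select(SQL))
print(stricter_is_select(SQL))
```

The naive check accepts the statement because it only looks at the leading `WITH`, while the stricter one rejects it. Plain keyword scanning also rejects harmless queries that merely contain a word such as `insert` inside a string literal, which is one reason to inspect a parsed statement instead.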

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Member

@betodealmeida betodealmeida left a comment

Looks good, but I'm worried about all the possible edge cases we're missing.

I wonder if we should also use sqloxide to prevent false negatives. We'd try to parse with sqloxide and check for non-SELECT statements, returning false if we find any. If we don't find any, we then run the current flow with sqlparse.

>>> parse_sql("WITH a AS ( INSERT INTO foo (id) VALUES (1) RETURNING id ) SELECT * FROM a;", "ansi")
[{'Query': {'with': {'recursive': False, 'cte_tables': [{'alias': {'name': {'value': 'a', 'quote_style': None}, 'columns': []}, 'query': {'with': None, 'body': {'Insert': {'Insert': {'or': None, 'into': True, 'table_name': [{'value': 'foo', 'quote_style': None}], 'columns': [{'value': 'id', 'quote_style': None}], 'overwrite': False, 'source': {'with': None, 'body': {'Values': {'explicit_row': False, 'rows': [[{'Value': {'Number': ('1', False)}}]]}}, 'order_by': [], 'limit': None, 'offset': None, 'fetch': None, 'locks': []}, 'partitioned': None, 'after_columns': [], 'table': False, 'on': None, 'returning': [{'UnnamedExpr': {'Identifier': {'value': 'id', 'quote_style': None}}}]}}}, 'order_by': [], 'limit': None, 'offset': None, 'fetch': None, 'locks': []}, 'from': None}]}, 'body': {'Select': {'distinct': False, 'top': None, 'projection': [{'Wildcard': {'opt_exclude': None, 'opt_except': None, 'opt_rename': None, 'opt_replace': None}}], 'into': None, 'from': [{'relation': {'Table': {'name': [{'value': 'a', 'quote_style': None}], 'alias': None, 'args': None, 'with_hints': []}}, 'joins': []}], 'lateral_views': [], 'selection': None, 'group_by': [], 'cluster_by': [], 'distribute_by': [], 'sort_by': [], 'having': None, 'qualify': None}}, 'order_by': [], 'limit': None, 'offset': None, 'fetch': None, 'locks': []}}]
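The check on a tree like the one above could be sketched as a recursive walk over the nested dicts and lists. This is only an illustration under stated assumptions: `contains_non_select` is a hypothetical helper, the `Insert` key is taken from the output above, and the `Update` and `Delete` keys are assumed by analogy. The sample trees are trimmed by hand so the sketch runs without sqloxide installed.

```python
from typing import Any

# "Insert" appears in the parsed output above; "Update" and "Delete" are
# assumptions about how other data-modifying statements would be keyed.
NON_SELECT_KEYS = {"Insert", "Update", "Delete"}


def contains_non_select(node: Any) -> bool:
    """Return True if any nested dict in the tree has a data-modifying key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in NON_SELECT_KEYS or contains_non_select(value):
                return True
        return False
    if isinstance(node, (list, tuple)):
        return any(contains_non_select(item) for item in node)
    return False


# Hand-trimmed copy of the tree above: a CTE whose body is an INSERT.
cte_with_insert = [{"Query": {
    "with": {"recursive": False, "cte_tables": [{
        "alias": {"name": {"value": "a", "quote_style": None}, "columns": []},
        "query": {"with": None, "body": {"Insert": {"Insert": {
            "into": True,
            "table_name": [{"value": "foo", "quote_style": None}],
        }}}},
    }]},
    "body": {"Select": {"distinct": False, "from": []}},
}}]

# A query with no CTE and a plain SELECT body.
plain_select = [{"Query": {"with": None,
                           "body": {"Select": {"distinct": False, "from": []}}}}]

print(contains_non_select(cte_with_insert))
print(contains_non_select(plain_select))
```

In the proposed flow, a True result would mean the statement is reported as not being a SELECT, and the existing sqlparse flow would run only when the walk finds nothing or the first parser cannot handle the statement.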

@dpgaspar
Member Author

Nice, will try that!

@pull-request-size pull-request-size bot added size/L and removed size/M labels Aug 18, 2023
Member

@betodealmeida betodealmeida left a comment

<3

@dpgaspar dpgaspar merged commit 3579861 into apache:master Aug 19, 2023
29 checks passed
@dpgaspar dpgaspar deleted the fix/cte-with-non-selects branch August 19, 2023 14:49
@michael-s-molina michael-s-molina added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 21, 2023
michael-s-molina pushed a commit that referenced this pull request Aug 21, 2023
@mistercrunch mistercrunch added 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
Labels
2.1.2 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L v2.1 v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 2.1.2 🍒 2.1.3 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0